MayBMS: A System for Managing Large Uncertain and Probabilistic Databases
نویسنده
چکیده
MayBMS is a state-of-the-art probabilistic database management system that has been built as an extension of Postgres, an open-source relational database management system. MayBMS follows a principled approach to leveraging the strengths of previous database research for achieving scalability. This article describes the main goals of this project, the design of query and update language, efficient exact and approximate query processing, and algorithmic and systems aspects. Acknowledgments. My collaborators on the MayBMS project are Dan Olteanu (Oxford University), Lyublena Antova (Cornell), Jiewen Huang (Oxford), and Michaela Goetz (Cornell). Thomas Jansen and Ali Baran Sari are alumni of the MayBMS team. I thank Dan Suciu for the inspirational talk he gave at a Dagstuhl seminar in February of 2005, which triggered my interest in probabilistic databases and the start of the project. I am also indebted to Joseph Halpern for insightful discussions. The project was previously supported by German Science Foundation (DFG) grant KO 3491/1-1 and by funding provided by the Center for Bioinformatics (ZBI) at Saarland University. It is currently supported by NSF grant IIS-0812272, a KDD grant, and a gift from Intel.
منابع مشابه
Scalable Statistical Modeling and Query Processing over Large Scale Uncertain Databases
Title of Dissertation: SCALABLE STATISTICAL MODELING AND QUERY PROCESSING OVER LARGE SCALE UNCERTAIN DATABASES Bhargav Kanagal Shamanna Doctor of Philosophy, 2011 Dissertation directed by: Dr. Amol Deshpande Dept. of Computer Science The past decade has witnessed a large number of novel applications that generate imprecise, uncertain and incomplete data. Examples include monitoring infrastructu...
متن کاملManaging Probabilistic Data with MystiQ: The Can-Do, the Could-Do, and the Can't-Do
MystiQ is a system that allows users to define a probabilistic database, then to evaluate SQL queries over this database. MystiQ is a middleware: the data itself is stored in a standard relational database system, and MystiQ is providing the probabilistic semantics. The advantage of a middleware over a reimplementation from scratch is that it can leverage the infrastructure of an existing datab...
متن کاملManaging Continuous Uncertain Data by a Probabilistic XML Database Management System
Database systems are widely used in today’s world. Almost every information system contains one or more databases. From a traditional perspective, databases are used to store precise values about objects in the ’real world’. However, many information is uncertain or imprecise. Consider, for example, sensor applications. Sensors produce uncertain and imprecise data since readings of sensors are ...
متن کاملMayBMS: A Possible Worlds Base Management System
1. MAYBMS Incomplete information is frequent in real-world applications. This is often the case in data integration scenarios, in scientific data collections, or whenever the information is acquired using human interaction and is erroneous or imperfect. The different interpretations of incomplete information yield different possible worlds. A system for managing incomplete data faces the challe...
متن کاملSemantics Representation of Probabilistic Data by Using Topk-Queries for Uncertain Data
Database systems for uncertain and probabilistic data promise to have many applications. Query processing on uncertain data occurs in the contexts of data warehousing, data integration, and of processing data extracted from the Web. Data cleaning can be fruitfully approached as a problem of reducing uncertainty in data and requires the management and processing of large amounts of uncertain dat...
متن کامل